A Posteriori Agreement as a Quality Measure for Readability Prediction Systems

Authors

  • Philip van Oosten
  • Véronique Hoste
  • Dries Tanghe
Abstract

All readability research is ultimately concerned with the question of whether a prediction system can automatically determine the readability level of an unseen text. A significant problem for such a system is that readability may depend in part on the reader. If different readers assess the readability of texts in fundamentally different ways, there is insufficient a priori agreement to justify the correctness of a readability prediction system based on the texts assessed by those readers. We built a data set of readability assessments by expert readers. We clustered the experts into groups with greater a priori agreement and then measured, for each group, whether classifiers trained only on data from that group exhibited a classification bias. Because this was found to be the case, the classification mechanism cannot be generalized unproblematically to a different user group.
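The idea of grouping experts by how much they agree can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not say which agreement statistic or clustering method was used, so the sketch assumes Cohen's kappa as the pairwise agreement measure and a simple threshold-based grouping (connected components). The function names `cohen_kappa` and `group_raters`, the expert names, the labels, and the threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each rater's own label distribution.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:  # both raters always use the same single label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def group_raters(ratings, threshold):
    """Merge raters whose pairwise kappa is at least `threshold`
    (assumed grouping rule: connected components via union-find)."""
    parent = {r: r for r in ratings}

    def find(r):
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for r1, r2 in combinations(ratings, 2):
        if cohen_kappa(ratings[r1], ratings[r2]) >= threshold:
            parent[find(r1)] = find(r2)

    groups = {}
    for r in ratings:
        groups.setdefault(find(r), []).append(r)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical assessments of six texts by four experts.
ratings = {
    "expert_a": ["easy", "easy", "hard", "hard", "easy", "hard"],
    "expert_b": ["easy", "easy", "hard", "hard", "easy", "easy"],
    "expert_c": ["hard", "hard", "easy", "easy", "hard", "easy"],
    "expert_d": ["hard", "hard", "easy", "easy", "hard", "hard"],
}
groups = group_raters(ratings, threshold=0.5)
print(groups)
```

With this toy data, the experts fall into two groups whose members agree with each other but not with the other group. A classifier trained on one group's labels would then be evaluated against the other group's labels to look for a systematic bias.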

Similar resources

An Integrated Approach for Measuring Software Quality and Code Readability

In this paper, we explore the concept of code readability and investigate its relation to software quality[1]. This is a new approach to measuring the complexity of software systems[2]. Software industry uses software metrics to measure the complexity of software systems for software cost estimation, software development control, software assurance, software testing, and software maintenance [3...

Assessing Readability of Patient Education Pamphlets in Training Hospitals in the City of Mashhad

Background: Patient education is taken into account as one of the key components of comprehensive care as well as one of the significant nursing functions in order to increase community health. In this respect, education materials and written texts can improve patient information up to 50% and consequently meet patient satisfaction. Readability is considered as an integral concept in patient ed...

A NOVEL FUZZY-BASED SIMILARITY MEASURE FOR COLLABORATIVE FILTERING TO ALLEVIATE THE SPARSITY PROBLEM

Memory-based collaborative filtering is the most popular approach to build recommender systems. Despite its success in many applications, it still suffers from several major limitations, including data sparsity. Sparse data affect the quality of the user similarity measurement and consequently the quality of the recommender system. In this paper, we propose a novel user similarity measure based...

A readability level prediction tool for K-12 books

The readability level of a book is a useful measure for children and teenagers (teachers, parents, and librarians, respectively) to identify reading materials suitable for themselves (their K-12 readers, respectively). Unfortunately, the majority of published books are assigned a readability level range, such as K-3, instead of a single readability level for their intended readers, by professionals...

Reliability, Readability and Quality of Online Information about Femoracetabular Impingement

Background: The Internet has become the most widely used source for patients seeking more information about their health, and many sites geared towards this audience have gained widespread use in recent years. Additionally, many healthcare institutions publish their own patient-education web sites with information regarding common conditions. Little is known about how these resources impact pati...

Publication date: 2011